A packetization and variable bitrate interframe compression scheme for vector quantizer-based distributed speech recognition
Abstract
We propose a novel packetization and variable bitrate compression scheme for DSR source coding, based on the Group of Pictures concept from video coding. The proposed algorithm simultaneously packetizes and further compresses source coded features using the high interframe correlation of speech, and is compatible with a variety of VQ-based DSR source coders. The algorithm approximates vector quantizers as Markov Chains, and empirically trains the corresponding probability parameters. Feature frames are then compressed as I-frames, P-frames, or B-frames, using Huffman tables. The proposed scheme can perform lossless compression, but is also robust to lossy compression through VQ pruning or frame puncturing. To illustrate its effectiveness, we applied the proposed algorithm to the ETSI DSR source coder. The algorithm provided compression rates of up to 31.60% with negligible recognition accuracy degradation, and rates of up to 71.15% with performance degradation under 1.0%.
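The core idea in the abstract can be illustrated with a minimal sketch: treat the stream of VQ indices as a first-order Markov chain, estimate transition counts from training data, and build one Huffman table per previous index. This is not the authors' implementation. The function names, the add-one smoothing, the fixed-length coding of I-frames, and the group length `gop` are all assumptions made for illustration, and B-frames, VQ pruning, and frame puncturing are omitted.

```python
import heapq
from collections import Counter, defaultdict
from itertools import count


def huffman_code(freqs):
    """Build a prefix code {symbol: bitstring} from {symbol: weight}."""
    if len(freqs) == 1:
        # A lone symbol still needs one bit so it stays decodable.
        return {next(iter(freqs)): "0"}
    tiebreak = count()  # unique counter so the heap never compares dicts
    heap = [(w, next(tiebreak), {s: ""}) for s, w in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w0, _, c0 = heapq.heappop(heap)
        w1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + b for s, b in c0.items()}
        merged.update({s: "1" + b for s, b in c1.items()})
        heapq.heappush(heap, (w0 + w1, next(tiebreak), merged))
    return heap[0][2]


def train_tables(sequences, codebook_size, smoothing=1):
    """Count index transitions; build one Huffman table per previous index."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for prev, cur in zip(seq, seq[1:]):
            counts[prev][cur] += 1
    tables = {}
    for prev in range(codebook_size):
        # Assumed add-one smoothing: every transition remains encodable,
        # which keeps the scheme lossless for unseen transitions.
        freqs = {c: counts[prev][c] + smoothing for c in range(codebook_size)}
        tables[prev] = huffman_code(freqs)
    return tables


def encode(seq, tables, codebook_size, gop=4):
    """Encode VQ indices: I-frame every `gop` frames, P-frames in between."""
    i_bits = max(1, (codebook_size - 1).bit_length())
    out = []
    for t, idx in enumerate(seq):
        if t % gop == 0:
            out.append(format(idx, f"0{i_bits}b"))  # I-frame: fixed length
        else:
            out.append(tables[seq[t - 1]][idx])  # P-frame: conditional code
    return "".join(out)


def decode(bits, n_frames, tables, codebook_size, gop=4):
    """Invert encode(); raises ValueError on a truncated bitstream."""
    i_bits = max(1, (codebook_size - 1).bit_length())
    inverse = {p: {b: s for s, b in tbl.items()} for p, tbl in tables.items()}
    seq, pos = [], 0
    for t in range(n_frames):
        if t % gop == 0:
            seq.append(int(bits[pos:pos + i_bits], 2))
            pos += i_bits
        else:
            table, end = inverse[seq[-1]], pos + 1
            # Prefix-free code: the first matching prefix is the codeword.
            while bits[pos:end] not in table:
                end += 1
                if end > len(bits):
                    raise ValueError("truncated bitstream")
            seq.append(table[bits[pos:end]])
            pos = end
    return seq
```

Because consecutive speech frames tend to map to the same or a nearby codeword, the conditional tables give short codes to the most likely transitions, while the periodic I-frames limit error propagation if a packet is lost.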
Similar resources
Scalable distributed speech recognition using multi-frame GMM-based block quantization
In this paper, we propose the use of the multi-frame Gaussian mixture model-based block quantizer for the coding of Mel frequency-warped cepstral coefficient (MFCC) features in distributed speech recognition (DSR) applications. This coding scheme exploits intraframe correlation via the Karhunen-Loève transform (KLT) and interframe correlation via the joint processing of adjacent frames together ...
Robust feature vector compression algorithm for distributed speech recognition
In this paper we propose an algorithm for efficient compression of the extracted feature parameters used in speech recognition. The algorithm provides a compression ratio of roughly 1:10 and causes negligible or no loss in recognition performance. It is also shown to be robust against environmental noise. Combined with an appropriate framing structure, a complete system is obtained, which can be use...
Quantization of LSF parameters using a trellis modeling
An efficient Block-based Trellis Quantization (BTQ) scheme is proposed for the quantization of the Line Spectral Frequencies (LSF) in speech coding applications. The scheme is based on the modeling of the LSF intraframe dependencies with a trellis structure. The ordering property and the fact that LSF parameters are bounded within a range are explicitly incorporated in the trellis model. BTQ sea...
Improved Modeling and Quantization Methods for Speech Coding
With the advent of 3G Wireless standards and subsequent bandwidth expansion, there is a clear need to design high quality, low complexity compression schemes which are bit-efficient. We have proposed a computationally efficient, high quality, vector quantization scheme based on a parametric probability density function (PDF). In this scheme, speech line spectral frequencies (LSF) are modeled as...
Predictive vector quantization using the M-algorithm for distributed speech recognition
In this paper we present a predictive vector quantizer for distributed speech recognition that makes use of a delayed decision coding scheme, performing the optimal codeword searching by means of the M-algorithm. In single-path predictive vector quantization coders, each frame is coded with the closest codeword to the prediction error. However, prediction errors and quantization errors of futur...
Publication date: 2007